Skip to content

Fix exception to distance constraints in dprog.c#108

Open
althonos wants to merge 1 commit intohyattpd:GoogleImportfrom
althonos:patch-6
Open

Fix exception to distance constraints in dprog.c#108
althonos wants to merge 1 commit intohyattpd:GoogleImportfrom
althonos:patch-6

Conversation

@althonos
Copy link
Copy Markdown
Contributor

To restrict computation to a banded diagonal of the dynamic programming matrix the dprog function computes the offset of the furthest node to consider when scoring a node. In the case of giant ORFs, this limit must sometimes be increased to make sure the complete ORF is always accounted for. In the current code, this is done in a faulty way and causes connections to be sometimes recomputed for the entire contig, because min is always reset to zero in such cases.

I tested the patch with 50 genomes and the results were the same to the main branch (according to the scores file output); however on a large genome (Sorangium cellulosum) the patch saves 9s (40%) of runtime:

Benchmark 1: ./prodigal -i GCF_000067165.1_ASM6716v1_genomic.fna -o /dev/null
  Time (mean ± σ):     15.744 s ±  0.924 s    [User: 15.639 s, System: 0.069 s]
  Range (min … max):   14.556 s … 16.704 s    4 runs
 
Benchmark 2: ./prodigal_orig -i GCF_000067165.1_ASM6716v1_genomic.fna -o /dev/null
  Time (mean ± σ):     22.440 s ±  1.216 s    [User: 22.296 s, System: 0.076 s]
  Range (min … max):   21.249 s … 24.135 s    4 runs
 
Summary
  ./prodigal -i GCF_000067165.1_ASM6716v1_genomic.fna -o /dev/null ran
    1.43 ± 0.11 times faster than ./prodigal_orig -i GCF_000067165.1_ASM6716v1_genomic.fna -o /dev/null

althonos added a commit to althonos/pyrodigal that referenced this pull request Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant